An optimality principle for Markovian decision processes
Similar resources
Bias Optimality for Multichain Markov Decision Processes
In recent research we find that the policy iteration algorithm for Markov decision processes (MDPs) is a natural consequence of the performance difference formula, which compares the performance of two different policies. In this paper, we extend this idea to the bias-optimal policy of MDPs. We first derive a formula that compares the biases of any two policies which have the sa...
An Optimality Principle for Concurrent Systems
This paper presents a formulation of an optimality principle for a new class of concurrent decision systems formed by products of deterministic Markov decision processes (MDPs). For a single MDP, the optimality principle reduces to the usual Bellman equation. The formulation is significant because it provides a basis for the development of optimisation algorithms for decentralised decision sy...
An Optimality Principle for Unsupervised Learning
We propose an optimality principle for training an unsupervised feedforward neural network based upon maximal ability to reconstruct the input data from the network outputs. We describe an algorithm which can be used to train either linear or nonlinear networks with certain types of nonlinearity. Examples of applications to the problems of image coding, feature detection, and analysis of random...
Non-Deterministic Policies in Markovian Decision Processes
Markovian processes have long been used to model stochastic environments. Reinforcement learning has emerged as a framework to solve sequential planning and decision-making problems in such environments. In recent years, attempts were made to apply methods from reinforcement learning to construct decision support systems for action selection in Markovian environments. Although conventional meth...
Journal
Journal title: Journal of Mathematical Analysis and Applications
Year: 1976
ISSN: 0022-247X
DOI: 10.1016/0022-247x(76)90243-2